    Model-based evolutionary algorithms

    Learning and exploiting mixed variable dependencies with a model-based EA

    Mixed-integer optimization considers problems with both discrete and continuous variables. The ability to learn and process problem structure can be of paramount importance for optimization, particularly when faced with black-box optimization (BBO) problems, where no problem structure is known a priori. For such cases, model-based Evolutionary Algorithms (EAs) have been very successful in the fields of discrete and continuous optimization. In this paper, we present a model-based EA which integrates techniques from the discrete and continuous domains in order to tackle mixed-integer problems. We furthermore introduce novel mechanisms to learn and exploit mixed-variable dependencies. Previous approaches only learned dependencies explicitly in either the discrete or the continuous domain. The potential usefulness of addressing mixed dependencies directly is assessed by empirically analyzing algorithm performance on a selection of mixed-integer problems with different types of variable interactions. We find substantially improved, scalable performance on problems that exhibit mixed dependencies.
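
    One way to make the idea of a learned mixed-variable dependency concrete is sketched below: the correlation ratio between a discrete and a continuous variable, estimated from a sample of solutions, can serve as a simple dependency score. This is only an illustrative proxy; the function mixed_dependency and the statistic used are assumptions, not the mechanism introduced in the paper.

    ```python
    import numpy as np

    def mixed_dependency(discrete_col, continuous_col):
        """Correlation ratio (eta squared) between a discrete and a continuous variable.

        Values near 0 suggest independence; values near 1 suggest that the discrete
        variable strongly determines the continuous one. A model-based EA could use
        such a score to decide which mixed variable pairs to model jointly
        (illustrative only, not the paper's mechanism).
        """
        discrete_col = np.asarray(discrete_col)
        continuous_col = np.asarray(continuous_col, dtype=float)
        grand_mean = continuous_col.mean()
        total = ((continuous_col - grand_mean) ** 2).sum()
        between = 0.0
        for value in np.unique(discrete_col):
            group = continuous_col[discrete_col == value]
            between += len(group) * (group.mean() - grand_mean) ** 2
        return between / total if total > 0 else 0.0

    # Example: the discrete variable selects which mode the continuous one is drawn from.
    rng = np.random.default_rng(0)
    x_d = rng.integers(0, 2, size=500)
    x_c = rng.normal(loc=3.0 * x_d, scale=0.5)
    print(mixed_dependency(x_d, x_c))                    # strong mixed dependency, near 1
    print(mixed_dependency(x_d, rng.normal(size=500)))   # no dependency, near 0
    ```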

    GAMBIT: A parameterless model-based evolutionary algorithm for mixed-integer problems

    Learning and exploiting problem structure is one of the key challenges in optimization. This is especially important for black-box optimization (BBO), where prior structural knowledge of a problem is not available. Existing model-based Evolutionary Algorithms (EAs) are very efficient at learning structure in both the discrete and the continuous domain. In this article, discrete and continuous model-building mechanisms are integrated for the Mixed-Integer (MI) domain, comprising discrete and continuous variables. We revisit a recently introduced model-based evolutionary algorithm for the MI domain, the Genetic Algorithm for Model-Based mixed-Integer opTimization (GAMBIT). We extend GAMBIT with a parameterless scheme that allows for practical use of the algorithm without the need to explicitly specify any parameters. We furthermore contrast GAMBIT with other model-based alternatives. The ultimate goal of processing mixed dependencies explicitly in GAMBIT is also addressed by introducing a new mechanism for their explicit exploitation. We find that processing mixed dependencies with this novel mechanism allows for more efficient optimization. We further contrast the parameterless GAMBIT with Mixed-Integer Evolution Strategies (MIES) and other state-of-the-art MI optimization algorithms from the General Algebraic Modeling System (GAMS) commercial algorithm suite, on problems with and without constraints, and show that GAMBIT is capable of solving problems where variable dependencies prevent many algorithms from successfully optimizing them.
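
    As a rough illustration of what a parameterless scheme can look like, the sketch below wraps a toy inner optimizer in a doubling-population restart loop, so no population size has to be specified by the user. Both run_ea (a placeholder random search) and the doubling scheme are assumptions for illustration; GAMBIT's actual interleaved scheme is more sophisticated.

    ```python
    import numpy as np

    def run_ea(population_size, evaluate, rng, generations=50, dimension=5):
        """Stand-in for one run of a GAMBIT-like algorithm with a fixed population size.

        Implemented here as a toy random search so the wrapper is runnable;
        the real model-based EA would go in its place.
        """
        best = np.inf
        for _ in range(generations):
            pop = rng.normal(size=(population_size, dimension))
            best = min(best, min(evaluate(x) for x in pop))
        return best

    def parameterless_scheme(evaluate, base_size=16, max_doublings=6, seed=0):
        """Doubling-population restart scheme (a generic parameterless strategy).

        Runs the inner algorithm with population sizes base_size, 2*base_size, ...
        and keeps the best result, removing the population size from the user's hands.
        """
        rng = np.random.default_rng(seed)
        best = np.inf
        for i in range(max_doublings):
            best = min(best, run_ea(base_size * 2 ** i, evaluate, rng))
        return best

    print(parameterless_scheme(lambda x: float(np.sum(x ** 2))))
    ```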

    Combining Model-based EAs for Mixed-Integer Problems

    A key characteristic of Mixed-Integer (MI) problems is the presence of both continuous and discrete problem variables. These variables can interact in various ways, resulting in challenging optimization problems. In this paper, we study the design of an algorithm that combines the strengths of LTGA and iAMaLGaM: state-of-the-art model-building EAs designed for discrete and continuous search spaces, respectively. We examine and discuss issues that emerge when trying to integrate those two algorithms in the MI setting. Our considerations lead to the design of a new algorithm for solving MI problems, which we motivate and compare with alternative approaches.
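
    The sketch below illustrates, in deliberately simplified form, how the two kinds of variation could coexist in one mixed-integer algorithm: LTGA-style mixing of a discrete linkage group with acceptance only on improvement, and iAMaLGaM-style sampling of the continuous part from an estimated Gaussian. The objective, the fixed linkage group, and the population sizes are toy assumptions, not the design proposed in the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def objective(d_part, c_part):
        """Toy mixed-integer objective (minimization): OneMax on bits plus sphere on reals."""
        return -float(d_part.sum()) + float((c_part ** 2).sum())

    # Population of mixed-integer solutions: 6 binary variables and 4 real variables.
    pop_d = rng.integers(0, 2, size=(30, 6))
    pop_c = rng.normal(size=(30, 4))

    # Discrete variation in the spirit of LTGA: copy a linkage group from a random donor
    # into a receiver and keep the change only if the fitness does not get worse.
    linkage_group = [0, 1, 2]                      # assumed, fixed group for illustration
    receiver_d, receiver_c = pop_d[0].copy(), pop_c[0].copy()
    donor = pop_d[rng.integers(len(pop_d))]
    trial_d = receiver_d.copy()
    trial_d[linkage_group] = donor[linkage_group]
    if objective(trial_d, receiver_c) <= objective(receiver_d, receiver_c):
        receiver_d = trial_d

    # Continuous variation in the spirit of iAMaLGaM: estimate a full-covariance Gaussian
    # from the better half of the population and sample a new real-valued part from it.
    order = np.argsort([objective(d, c) for d, c in zip(pop_d, pop_c)])
    selected_c = pop_c[order[:15]]
    mean, cov = selected_c.mean(axis=0), np.cov(selected_c, rowvar=False)
    new_c = rng.multivariate_normal(mean, cov)
    print(receiver_d, new_c)
    ```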

    A Clustering-Based Model-Building EA for Optimization Problems with Binary and Real-Valued Variables

    We propose a novel clustering-based model-building evolutionary algorithm to tackle optimization problems that have both binary and real-valued variables. The search space is clustered every generation using a distance metric that considers binary and real-valued variables jointly in order to capture and exploit dependencies between variables of different types. After clustering, linkage learning takes place within each cluster to capture and exploit dependencies between variables of the same type. We compare this with a model-building approach that only considers dependencies between variables of the same type. Additionally, since many real-world problems have constraints, we examine the use of different well-known approaches to handling constraints: constraint domination, dynamic penalty, and global competitive ranking. We experimentally analyze the performance of the proposed algorithms on various unconstrained problems as well as a selection of well-known MINLP benchmark problems that all have constraints, and compare our results with the Mixed-Integer Evolution Strategy (MIES). We find that our clustering approach, aimed at processing dependencies between binary and real-valued variables, can significantly improve performance in terms of required population size and number of function evaluations when solving problems that exhibit properties such as multiple optima, strong mixed dependencies, and constraints.
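
    A minimal sketch of a joint distance metric over binary and real-valued variables, of the kind such a per-generation clustering step could rely on, is given below. The specific combination used here (Hamming distance plus range-normalised Euclidean distance) is an assumption for illustration, not necessarily the metric proposed in the paper.

    ```python
    import numpy as np

    def mixed_distance(a_bin, a_real, b_bin, b_real, real_ranges):
        """Joint distance over binary and real-valued variables.

        The Hamming fraction on the binary part is added to the range-normalised
        Euclidean distance on the real part, so that neither variable type
        dominates the clustering.
        """
        hamming = np.mean(a_bin != b_bin)
        euclid = np.sqrt(np.mean(((a_real - b_real) / real_ranges) ** 2))
        return float(hamming + euclid)

    rng = np.random.default_rng(2)
    pop_bin = rng.integers(0, 2, size=(8, 5))      # binary parts of 8 solutions
    pop_real = rng.uniform(-5, 5, size=(8, 3))     # real-valued parts of the same solutions
    ranges = np.full(3, 10.0)                      # width of each real variable's domain

    # A pairwise distance such as this could feed any standard clustering routine.
    print(round(mixed_distance(pop_bin[0], pop_real[0], pop_bin[1], pop_real[1], ranges), 3))
    ```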

    In Search of Optimal Linkage Trees

    Linkage-learning Evolutionary Algorithms (EAs) use linkage learning to construct a linkage model, which is exploited to solve problems efficiently by taking into account important linkages, i.e., dependencies between problem variables, during variation. It has been shown that when this linkage model is aligned correctly with the structure of the problem, these EAs are capable of solving problems efficiently by performing variation based on this linkage model [2]. The Linkage Tree Genetic Algorithm (LTGA) uses a Linkage Tree (LT) as a linkage model to identify the problem's structure hierarchically, enabling it to solve various problems very efficiently. Understanding the reasons for LTGA's excellent performance is highly valuable, as LTGA is also able to efficiently solve problems for which a tree-like linkage model seems inappropriate. This brings us to ask what, in fact, makes a linkage model ideal for use by LTGA.
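
    For concreteness, the sketch below builds a linkage tree in the usual LTGA fashion: pairwise mutual information between problem variables is estimated from a population, and the variables are clustered hierarchically so that every merge yields one linkage set. The distance transform and the UPGMA clustering used here are simplifying assumptions; LTGA's own bottom-up construction differs in detail.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import linkage
    from scipy.spatial.distance import squareform

    def pairwise_mutual_information(pop):
        """Pairwise mutual information between binary problem variables.

        pop is an (n_solutions, n_variables) 0/1 array, e.g. the selected
        solutions of one generation.
        """
        _, ell = pop.shape
        mi = np.zeros((ell, ell))
        for i in range(ell):
            for j in range(i + 1, ell):
                total = 0.0
                for a in (0, 1):
                    for b in (0, 1):
                        p_ab = np.mean((pop[:, i] == a) & (pop[:, j] == b))
                        if p_ab > 0:
                            p_a, p_b = np.mean(pop[:, i] == a), np.mean(pop[:, j] == b)
                            total += p_ab * np.log(p_ab / (p_a * p_b))
                mi[i, j] = mi[j, i] = total
        return mi

    def linkage_tree(pop):
        """Hierarchically cluster variables; every merge becomes one linkage set."""
        mi = pairwise_mutual_information(pop)
        dist = squareform(mi.max() - mi, checks=False)   # high MI -> small distance
        merges = linkage(dist, method='average')         # UPGMA-style tree
        clusters = [{i} for i in range(pop.shape[1])]
        fos = [frozenset(c) for c in clusters]
        for a, b, *_ in merges:
            merged = clusters[int(a)] | clusters[int(b)]
            clusters.append(merged)
            fos.append(frozenset(merged))
        return fos

    rng = np.random.default_rng(3)
    # Two tightly linked blocks: variables 0-2 copy one bit, variables 3-5 copy another.
    pop = np.repeat(rng.integers(0, 2, size=(200, 2)), 3, axis=1)
    print(linkage_tree(pop))    # the blocks {0,1,2} and {3,4,5} appear as linkage sets
    ```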

    Probabilistic Models for Aggregate Analysis of Non-Gaussian Data in Biomedicine

    Aggregate association analysis is a popular approach in genome-wide association studies (GWAS) that analyzes the association between the trait of interest and regions of functionally related genes. It has the advantage of capturing the missing heritability from the joint effects of correlated genetic variants while providing a better understanding of disease etiology from a systematic perspective. However, traditional methods lose their power for biomedical data with non-Gaussian data types. We propose innovative statistical models that derive more accurate aggregated signals and enhance power by taking these special data types into account. Based on general exponential family distribution assumptions, we developed supervised logistic PCA and supervised categorical PCA for pathway-based GWAS and rare-variant analysis. A general framework, sparse exponential family PCA (SePCA), is further developed for aggregate analyses of various types of biomedical data with good interpretability. We derived an efficient algorithm that finds the optimal aggregated signals by solving the equivalent dual problem with closed-form updating rules. SePCA is also extended to aggregate association analysis at hierarchical levels, from groups down to individual variables, for better biological interpretation. Both simulation studies and real-world applications have demonstrated that our methods achieve higher power in association analysis and population stratification by properly accounting for the correlations among the non-Gaussian variables in biomedical data. Another analytic issue in aggregate analysis is that biomedical data often have special stratified data structures, due to experiment designs that address confounding issues. We extended SePCA to low-rank and full-rank matched models to account for these stratified data structures. A simulation study demonstrated their capability of reconstructing more relevant PCs for the signals of interest compared to standard ePCA. A sparse low-rank matched PCA model outperforms existing Bayesian methods in detecting differentially expressed genes in a benchmark spike-in gene study with technical replicates. In summary, our proposed statistical models for non-Gaussian biomedical data derive more accurate and robust aggregated signals that help reveal underlying biological principles of human disease. Beyond bioinformatics, these probabilistic models also have rich applications in data mining, computer vision, and the social sciences.
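
    As a small illustration of the exponential-family PCA idea underlying this line of work, the sketch below fits a plain logistic PCA to binary data by gradient ascent on the Bernoulli log-likelihood of a low-rank natural-parameter matrix. It deliberately omits the sparsity, supervision, and closed-form dual updates that SePCA adds, so it should be read as background, not as the proposed method.

    ```python
    import numpy as np

    def logistic_pca(X, n_components=2, n_iter=300, lr=0.01, seed=0):
        """Gradient-ascent sketch of logistic PCA on a 0/1 data matrix X.

        The Bernoulli natural parameters are factorised as U @ V.T and the
        log-likelihood is maximised with plain gradient steps on U and V.
        """
        rng = np.random.default_rng(seed)
        n, d = X.shape
        U = rng.normal(scale=0.1, size=(n, n_components))   # per-sample scores
        V = rng.normal(scale=0.1, size=(d, n_components))   # per-feature loadings
        for _ in range(n_iter):
            theta = np.clip(U @ V.T, -30.0, 30.0)            # clip for numerical stability
            residual = X - 1.0 / (1.0 + np.exp(-theta))      # X - sigmoid(theta)
            U += lr * residual @ V
            V += lr * residual.T @ U
        return U, V

    # Synthetic binary data generated from a rank-1 latent structure.
    rng = np.random.default_rng(4)
    latent = rng.normal(size=(100, 1))
    loadings = rng.normal(size=(20, 1))
    probs = 1.0 / (1.0 + np.exp(-(latent @ loadings.T)))
    X = (rng.uniform(size=probs.shape) < probs).astype(float)
    U, V = logistic_pca(X, n_components=1)
    print(U.shape, V.shape)
    ```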

    Niching an estimation-of-distribution algorithm by hierarchical Gaussian mixture learning

    Estimation-of-Distribution Algorithms (EDAs) have been applied with quite some success when solving real-valued optimization problems, especially in the case of Black-Box Optimization (BBO). Generally, the performance of an EDA depends on the match between its driving probability distribution and the landscape of the problem being solved. Because most well-known EDAs, including CMA-ES, NES, and AMaLGaM, use a uni-modal search distribution, they have a high risk of getting trapped in local optima when a problem is multi-modal with a (moderate) number of relatively comparable modes. This risk could potentially be mitigated using niching methods that define multiple regions of interest where separate search distributions govern sub-populations. However, a key question is how to determine a suitable number of niches, especially in BBO. In this paper, we present a novel, adaptive niching approach that determines the niches through hierarchical clustering based on the correlation between the probability densities and fitness values of solutions. We test the performance of a combination of this niching approach with AMaLGaM on both new and well-known niching benchmark problems and find that the new approach properly identifies multiple landscape modes, leading to much better performance on multi-modal problems than with a non-niched, uni-modal EDA.
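
    A stripped-down illustration of niching by clustering is sketched below: solutions are split into niches by hierarchical clustering and a separate Gaussian is fitted per niche. Both the clustering criterion (Ward on solution coordinates) and the fixed niche count are simplifications; the paper's approach instead clusters on the correlation between probability densities and fitness values and determines the number of niches adaptively.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def fit_niches(solutions, n_niches):
        """Split solutions into niches and fit one Gaussian (mean, covariance) per niche."""
        tree = linkage(solutions, method='ward')                   # hierarchical clustering
        labels = fcluster(tree, t=n_niches, criterion='maxclust')  # cut into n_niches groups
        models = []
        for k in np.unique(labels):
            members = solutions[labels == k]
            models.append((members.mean(axis=0), np.cov(members, rowvar=False)))
        return labels, models

    rng = np.random.default_rng(5)
    # Solutions sampled around two well-separated modes of a bimodal landscape.
    solutions = np.vstack([rng.normal(-3.0, 0.5, size=(50, 2)),
                           rng.normal(+3.0, 0.5, size=(50, 2))])
    labels, models = fit_niches(solutions, n_niches=2)
    for mean, cov in models:
        print(np.round(mean, 2))    # one mean per recovered niche, near (-3,-3) and (3,3)
    ```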